
Improve JS/TS comment filtering #81

Draft
Vladyslav-Kuksiuk wants to merge 5 commits into improve-java-comment-filtering from improve-js-comment-filtering

Conversation

@Vladyslav-Kuksiuk

Collaborator

This PR improves JS/TS comment filtering.

Resolves this issue.

Copilot AI left a comment

Contributor


Pull request overview

This PR introduces a dedicated JavaScript/TypeScript comment filter to avoid stripping comment-like text inside template literals and regex literals, addressing the regression described in issue #67.

Changes:

  • Added a new JavaScriptCommentFilter with JS/TS-aware scanning for template literals (including ${...}) and regex literals.
  • Switched .js/.jsx/.ts/.tsx filtering from the generic marker-based filter to the new JS/TS filter and removed the old JS marker config.
  • Expanded the JS/TS test suite with cases covering templates, nested templates, and regex literals.
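To make the motivation concrete, here is a minimal sketch of why a JS/TS filter has to track literal state before it treats `//` as a comment start. It is not the PR's `JavaScriptCommentFilter`: `stripLineComment` is a hypothetical helper that handles a single line only and deliberately ignores `${...}` nesting, regex literals, and block comments.

```go
package main

import (
	"fmt"
	"strings"
)

// stripLineComment removes a trailing // comment from one line of JS/TS,
// ignoring slashes inside '...', "..." and `...` literals.
// Hypothetical helper for illustration: no multi-line state, no ${...}
// nesting, no regex literals, no block comments.
func stripLineComment(line string) string {
	var quote byte // 0 while outside any literal
	for i := 0; i < len(line); i++ {
		c := line[i]
		switch {
		case quote != 0 && c == '\\':
			i++ // skip the escaped byte
		case quote != 0 && c == quote:
			quote = 0 // literal closed
		case quote != 0:
			// inside a literal: nothing here can start a comment
		case c == '\'' || c == '"' || c == '`':
			quote = c
		case strings.HasPrefix(line[i:], "//"):
			return strings.TrimRight(line[:i], " \t")
		}
	}
	return line
}

func main() {
	fmt.Println(stripLineComment("const url = `https://example.com`; // home"))
	fmt.Println(stripLineComment(`const s = "a // b";`))
}
```

A marker-based filter that searches for the first `//` would cut the first line at `https:`, which is the kind of regression the overview describes.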

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| embedding/commentfilter/javascript_filter.go | New JS/TS-specific lexical scanner to strip real comments while preserving template/regex literal content. |
| embedding/commentfilter/config.go | Registers JavaScriptCommentFilter for JS/TS extensions and removes the old jsSyntax marker configuration. |
| embedding/commentfilter/filter_test.go | Adds JS/TS regression tests for templates, nested templates, and regex literals. |

Comment on lines +273 to +297
// regexStartsHere reports whether the slash at the current position can start a regex literal.
func (f *javascriptLineFilter) regexStartsHere() bool {
	if f.line[f.position] != '/' {
		return false
	}
	if strings.HasPrefix(f.line[f.position:], "//") ||
		strings.HasPrefix(f.line[f.position:], cStyleBlockCommentStart) {
		return false
	}
	previous := previousSignificantToken(f.line[:f.position])
	if previous == "" {
		return true
	}
	if previous == "++" || previous == "--" {
		return false
	}
	if regexPrecedingKeyword(previous) {
		return true
	}
	if len(previous) != 1 {
		return false
	}

	return strings.ContainsRune("([{=,:;!&|?+-*~^<>%", rune(previous[0]))
}
Comment on lines +345 to +366
// consumeNestedTemplateLiteral copies a template literal found inside interpolation code.
func (f *javascriptLineFilter) consumeNestedTemplateLiteral() bool {
	if !f.startOrResumeNestedTemplateLiteral() {
		return false
	}
	for f.position < len(f.line) {
		switch {
		case f.line[f.position] == '\\':
			f.writeEscapedByte()
		case f.line[f.position] == '`':
			f.consumeCodeByte()
			f.state.nestedTemplate = false

			return true
		case strings.HasPrefix(f.line[f.position:], jsTemplateInterpolationStart):
			f.result.WriteString(jsTemplateInterpolationStart)
			f.position += len(jsTemplateInterpolationStart)
			depth := 1
			f.consumeInterpolationDepth(&depth)
			if depth > 0 {
				return true
			}
Comment on lines 50 to +54
  // JavaScript
- ".js": filterConfig(MarkerCommentFilter{Syntax: jsSyntax}, allModes),
- ".jsx": filterConfig(MarkerCommentFilter{Syntax: jsSyntax}, allModes),
- ".ts": filterConfig(MarkerCommentFilter{Syntax: jsSyntax}, allModes),
- ".tsx": filterConfig(MarkerCommentFilter{Syntax: jsSyntax}, allModes),
+ ".js": filterConfig(JavaScriptCommentFilter{}, allModes),
+ ".jsx": filterConfig(JavaScriptCommentFilter{}, allModes),
+ ".ts": filterConfig(JavaScriptCommentFilter{}, allModes),
+ ".tsx": filterConfig(JavaScriptCommentFilter{}, allModes),
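For readers unfamiliar with this kind of registry, the sketch below shows how an extension-keyed map like the one in this hunk might be consulted. All types and names here (`CommentFilter`, `markerFilter`, `javaScriptFilter`, `filterFor`) are hypothetical, and the `.go` entry is an assumption added only to give the map a second kind of filter. The real `filterConfig` and mode handling are not visible in the diff.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// CommentFilter is a hypothetical interface for this sketch.
type CommentFilter interface {
	Name() string
}

type markerFilter struct{}

func (markerFilter) Name() string { return "marker" }

type javaScriptFilter struct{}

func (javaScriptFilter) Name() string { return "javascript" }

// filtersByExtension maps a lowercase file extension to its filter.
var filtersByExtension = map[string]CommentFilter{
	".js":  javaScriptFilter{},
	".jsx": javaScriptFilter{},
	".ts":  javaScriptFilter{},
	".tsx": javaScriptFilter{},
	".go":  markerFilter{}, // assumed entry, for contrast only
}

// filterFor picks a filter from the file's extension; ok is false when
// no filter is registered for it.
func filterFor(path string) (CommentFilter, bool) {
	f, ok := filtersByExtension[strings.ToLower(filepath.Ext(path))]
	return f, ok
}

func main() {
	for _, p := range []string{"src/App.tsx", "main.go", "README"} {
		if f, ok := filterFor(p); ok {
			fmt.Println(p, "uses", f.Name())
		} else {
			fmt.Println(p, "has no filter")
		}
	}
}
```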